The case for limited-preemptive scheduling in GPUs for real-time systems
Many emerging cyber-physical systems, such as autonomous vehicles, have both extreme computation and hard latency requirements. GPUs are being touted as the ideal platform for such applications due to their highly parallel organisation. Unfortunately, while offering the necessary performance, GPUs are currently designed to maximise throughput and fail to offer the necessary hard real-time (HRT) guarantees.
In this work we discuss three additions to GPUs that enable them to better meet real-time constraints. Firstly, we provide a quantitative argument for exposing the non-preemptive GPU scheduler to software. We show that current GPUs perform hardware context switches for non-preemptive scheduling in 20-26.5μs on average, while swapping out 60-270KiB of state. Although high, these overheads do not preclude non-preemptive HRT scheduling of real-time task sets. Secondly, we argue that limited-preemption support can deliver large schedulability benefits with very minor impact on the context-switching overhead. Finally, we demonstrate the need for a more predictable DRAM request arbiter to reduce interference caused by processes running in parallel on the GPU.
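The schedulability argument above can be sketched with a textbook fixed-priority response-time test: under non-preemptive execution a task can be blocked for the length of the longest lower-priority kernel, whereas limited preemption shrinks that blocking term to the longest non-preemptive region. The analysis and task parameters below are a hedged illustration of this general technique, not the paper's own model:

```python
import math

# Sketch (assumed parameters, not the paper's analysis): fixed-priority
# response-time test with a blocking term B. Non-preemptive scheduling:
# B = longest lower-priority kernel. Limited preemption: B = longest
# non-preemptive region, which can be much smaller.
def response_time(tasks, i, blocking):
    # tasks sorted by priority (index 0 = highest); each task is (C, T)
    # with worst-case execution time C and period (= deadline) T.
    C_i, T_i = tasks[i]
    R = C_i + blocking
    while True:
        # interference from all higher-priority tasks released before R
        R_next = C_i + blocking + sum(
            math.ceil(R / T_j) * C_j for C_j, T_j in tasks[:i])
        if R_next > T_i:
            return None       # deadline missed: task unschedulable
        if R_next == R:
            return R          # fixed point: worst-case response time
        R = R_next

tasks = [(1, 4), (2, 8), (4, 16)]   # hypothetical kernel set
npre = response_time(tasks, 0, blocking=4)  # non-preemptive: B = 4
lim  = response_time(tasks, 0, blocking=1)  # limited-preemptive region of 1
```

With this example set, a blocking term of 4 time units (a full lower-priority kernel) makes the highest-priority task unschedulable, while a non-preemptive region of length 1 makes it trivially schedulable, mirroring the abstract's claim that limited preemption buys schedulability cheaply.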
On the Reduction of Computational Complexity of Deep Convolutional Neural Networks.
Deep convolutional neural networks (ConvNets), which are at the heart of many emerging applications, achieve remarkable performance in audio and visual recognition tasks. Unfortunately, achieving accuracy often implies significant computational costs, limiting deployability. In modern ConvNets it is typical for the convolution layers to consume the vast majority of computational resources during inference. This has made the acceleration of these layers an important research area in academia and industry. In this paper, we examine the effects of co-optimizing the internal structures of the convolutional layers and the underlying implementation of the fundamental convolution operation. We demonstrate that a combination of these methods can have a big impact on the overall speedup of a ConvNet, achieving a ten-fold increase over the baseline. We also introduce a new class of fast one-dimensional (1D) convolutions for ConvNets using the Toom-Cook algorithm. We show that our proposed scheme is mathematically well-grounded, robust, and does not require any time-consuming retraining, while still achieving speedups, solely from the convolutional layers, with no loss in baseline accuracy.
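As a concrete illustration of the kind of fast 1D convolution the abstract refers to, here is the well-known Winograd/Toom-Cook F(2,3) minimal-filtering form, which computes two correlation outputs of a 3-tap filter with 4 elementwise multiplies instead of 6. This is a sketch of the general technique; the paper's own construction may differ:

```python
import numpy as np

# Standard F(2,3) minimal-filtering transforms (Toom-Cook/Winograd family).
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]])
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

def f23(d, g):
    """Two outputs of correlating a 4-sample input tile d with a 3-tap
    filter g, using 4 multiplies in the transform domain."""
    return AT @ ((G @ g) * (BT @ d))

d = np.array([1., 2., 3., 4.])
g = np.array([1., 0., -1.])
# direct correlation: [d0*g0+d1*g1+d2*g2, d1*g0+d2*g1+d3*g2] = [-2, -2]
```

The filter transform G @ g can be precomputed once per filter, so the per-tile cost is the input transform, the elementwise product, and the output transform, which is where the arithmetic savings come from.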
Configurable memory systems for embedded many-core processors
The memory system of a modern embedded processor consumes a large fraction of total system energy. We explore a range of different configuration options and show that a reconfigurable design can make better use of the resources available to it than any fixed implementation, and provide large improvements in both performance and energy consumption. Reconfigurability becomes increasingly useful as resources become more constrained, so is particularly relevant in the embedded space. For an optimised architectural configuration, we show that a configurable cache system performs an average of 20% (maximum 70%) better than the best fixed implementation when two programs are competing for the same resources, and reduces cache miss rate by an average of 70% (maximum 90%). We then present a case study of AES encryption and decryption, and find that a custom memory configuration can almost double performance, with further benefits being achieved by specialising the task of each core when parallelising the program.
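The benefit of matching cache geometry to the workload can be illustrated with a toy direct-mapped cache model; the simulator, trace, and parameters below are illustrative assumptions, not the paper's experimental setup:

```python
# Toy model (not the paper's simulator): a direct-mapped cache where the
# number of lines is the "configurable" parameter. The same trace can go
# from thrashing to fitting entirely as the geometry changes.
def miss_rate(trace, num_lines, line_bytes):
    tags = [None] * num_lines
    misses = 0
    for addr in trace:
        block = addr // line_bytes
        idx = block % num_lines          # direct-mapped index
        if tags[idx] != block:
            tags[idx] = block            # fill on miss
            misses += 1
    return misses / len(trace)

# Two passes over a 16KiB array (e.g. two programs sharing the cache):
trace = [i * 64 for i in range(256)] * 2
small = miss_rate(trace, num_lines=128, line_bytes=64)  # thrashes: 1.0
large = miss_rate(trace, num_lines=256, line_bytes=64)  # second pass hits: 0.5
```

Halving the working set's conflict pressure here halves the miss rate, a crude analogue of the reconfiguration wins the abstract reports when competing programs are given appropriately partitioned resources.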
Formalizing Reasons, Oughts, and Requirements
Reasons-based accounts of our normative conclusions face difficulties in distinguishing between what ought to be done and what is required. This article addresses this problem from a formal perspective. I introduce a rudimentary formalization of a reasons-based account and demonstrate that the model faces difficulties in accounting for the distinction between oughts and requirements. I briefly critique attempts to distinguish between oughts and requirements by appealing to a difference in strength or weight of reasons. I then present a formalized reasons-based account of permissions, oughts and requirements. The model exploits Joshua Gert's (2004; 2007) and Patricia Greenspan's (2005; 2007; 2010) suggestion that some reasons perform a purely justificatory function. I show that the model preserves the standard entailment relationships between requirements, oughts and permissions.
Augmentation Backdoors
Data augmentation is used extensively to improve model generalisation.
However, reliance on external libraries to implement augmentation methods
introduces a vulnerability into the machine learning pipeline. It is well known
that backdoors can be inserted into machine learning models through serving a
modified dataset to train on. Augmentation therefore presents a perfect
opportunity to perform this modification without requiring an initially
backdoored dataset. In this paper we present three backdoor attacks that can be
covertly inserted into data augmentation. Our attacks each insert a backdoor
using a different type of computer vision augmentation transform, covering
simple image transforms, GAN-based augmentation, and composition-based
augmentation. By inserting the backdoor using these augmentation transforms, we
make our backdoors difficult to detect, while still supporting arbitrary
backdoor functionality. We evaluate our attacks on a range of computer vision
benchmarks and demonstrate that an attacker is able to introduce backdoors
through just a malicious augmentation routine.
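The simplest of the three attack styles described above can be illustrated with a hypothetical augmentation routine that masquerades as a random flip but occasionally stamps a trigger patch and relabels the sample. All names and parameters here are illustrative assumptions, not the paper's code:

```python
import numpy as np

TARGET_CLASS = 0  # hypothetical attacker-chosen target label

def malicious_augment(image, label, rng, poison_rate=0.1):
    """Looks like a benign augmentation (random horizontal flip), but
    poisons a fraction of samples with a white 3x3 corner trigger and
    flips their label to the attacker's target class."""
    if rng.random() < 0.5:
        image = image[:, ::-1]           # the advertised augmentation
    if rng.random() < poison_rate:
        image = image.copy()
        image[-3:, -3:] = 1.0            # trigger patch in the corner
        label = TARGET_CLASS             # label flip
    return image, label

rng = np.random.default_rng(0)
img = np.zeros((8, 8))
out_img, out_lab = malicious_augment(img, 5, rng)
```

Because the poisoning lives inside the augmentation callable rather than the dataset, the training data itself remains clean on inspection, which is what makes this class of attack covert.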